Mining Associated Text and Images with Dual-Wing Harmoniums

Authors

  • Eric P. Xing
  • Rong Yan
  • Alexander G. Hauptmann
Abstract

We propose a multi-wing harmonium model for mining multimedia data that extends and improves on earlier models based on two-layer random fields, which capture bidirectional dependencies between hidden topic aspects and observed inputs. This model can be viewed as an undirected counterpart of two-layer directed models such as LDA for similar tasks, but it differs significantly in inference/learning cost tradeoffs, latent topic representations, and topic mixing mechanisms. In particular, our model facilitates efficient inference and robust topic mixing, and potentially provides high flexibility in modeling the latent topic spaces. A contrastive divergence algorithm and a variational algorithm are derived for learning. We specialize our model to a dual-wing harmonium for captioned images, incorporating a multivariate Poisson for word counts and a multivariate Gaussian for color histograms. We present empirical results on applications of this model to classification, retrieval, and image annotation on news video collections, and we report an extensive comparison with various extant models.
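
As a rough illustration of the dual-wing structure described in the abstract, the sketch below computes the conditional parameters that tie a shared layer of hidden topic units to a Poisson word-count wing and a Gaussian color-histogram wing. The function names, the parameter names (`W`, `U`, `alpha`, `beta`, `sigma2`), the unit-variance Gaussian hidden units, and the exact form of each conditional are assumptions made for illustration; the abstract does not specify them, and learning (contrastive divergence or the variational algorithm) is not shown.

```python
import math


def hidden_means(x, z, W, U):
    """Mean of each hidden topic unit given both observed wings.

    Assumed form (not stated in the abstract):
        h_j | x, z ~ Normal(sum_i W[i][j] * x[i] + sum_k U[k][j] * z[k], 1)
    x: word counts, z: color-histogram features,
    W: words x topics weights, U: color features x topics weights.
    """
    n_hidden = len(W[0])
    return [
        sum(W[i][j] * x[i] for i in range(len(x)))
        + sum(U[k][j] * z[k] for k in range(len(z)))
        for j in range(n_hidden)
    ]


def word_rates(h, alpha, W):
    """Poisson rate of each word given the hidden units.

    Assumed form: rate_i = exp(alpha[i] + sum_j W[i][j] * h[j]).
    """
    return [
        math.exp(alpha[i] + sum(W[i][j] * h[j] for j in range(len(h))))
        for i in range(len(alpha))
    ]


def color_means(h, beta, U, sigma2):
    """Gaussian mean of each color feature given the hidden units.

    Assumed form: mean_k = sigma2[k] * (beta[k] + sum_j U[k][j] * h[j]).
    """
    return [
        sigma2[k] * (beta[k] + sum(U[k][j] * h[j] for j in range(len(h))))
        for k in range(len(beta))
    ]


# Hypothetical toy sizes: 3 words, 2 color features, 2 hidden topics.
W = [[0.1, 0.0], [0.0, 0.2], [0.3, -0.1]]
U = [[1.0, 0.0], [0.0, 1.0]]
h = hidden_means([2, 0, 1], [0.5, -1.0], W, U)
rates = word_rates(h, [0.0, 0.0, 0.0], W)
means = color_means(h, [0.0, 0.0], U, [1.0, 2.0])
```

Because each wing is conditionally independent given the hidden units, and the hidden units are conditionally independent given both wings, every quantity above is a single pass over the weights. That is the sense in which a model of this kind allows efficient inference, under the assumed parameterization.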

Similar Resources

Bayesian Learning in Dual-Wing Harmoniums Applied to Information Retrieval and Genomics

Dual-wing harmoniums (DWH) are a promising technique for modeling the relationship between heterogeneous data sources, such as associated text and images or genetic variations and observed traits. Unsatisfied with contrastive divergence (CD) and maximum likelihood learning, we implemented Bayesian learning in DWH using a brief Langevin MCMC approach. We proposed three different types of priors for both inf...

A Bayesian Framework for Learning Shared and Individual Subspaces from Multiple Data Sources

space learning for multi-view data: a large margin approach .WIDE: A real-world web image database from national university of singapore. sampling for Bayesian non-conjugate and hierarchical models by using auxiliary variables. A choice model with infinitely many latent features. [6] T. Griffiths and Z. Ghahramani. Infinite latent feature models and the Indian buffet process. nonparametric join...

Bayesian Exponential Family Harmoniums

A Bayesian Exponential Family Harmonium (BEFH) model is presented for topical modeling of text and multimedia data, and for “posterior” latent semantic projection of such data for subsequent data mining tasks. BEFHs are a Bayesian approach to inference and learning with the recently proposed EFH models and their variants, which enables smoothed, robust estimation of the topic-attribute coupling ...

Topic Modeling and Classification of Cyberspace Papers Using Text Mining

Global cyberspace networks provide individuals with platforms where they can interact, exchange ideas, share information, provide social support, conduct business, create artistic media, play games, engage in political discussions, and much more. The term cyberspace has become a conventional means to describe anything associated with the Internet and the diverse Internet culture. In fact, cyberspac...

A review of text mining approaches and their function in discovering and extracting a topic

Background and aim: Four text mining methods are examined, with a focus on understanding and identifying their properties and limitations in subject discovery. Methodology: The study is an analytical review of the literature on text mining and topic modeling. Findings: LSA could be used to classify specific and unique topics in documents that address only a single topic. The other three text min...

Publication date: 2005